Rhizobium leguminosarum bv. trifolii SRDI565 (syn. N8-J) is an aerobic, motile,
Gram-negative, non-spore-forming rod. SRDI565 was isolated from a nodule recovered from the
roots of the annual clover Trifolium subterraneum subsp. subterraneum grown in the
greenhouse and inoculated with soil collected from New South Wales, Australia. SRDI565 has a
broad host range for nodulation within the clover genus; however, N2-fixation is sub-optimal
with some Trifolium species and ineffective with others. Here we describe the features of R.
leguminosarum bv. trifolii strain SRDI565, together with genome sequence information and
annotation. The 6,905,599 bp high-quality-draft genome is arranged into 7 scaffolds of 7
contigs, contains 6,750 protein-coding genes and 86 RNA-only encoding genes, and is one of
100 rhizobial genomes sequenced as part of the DOE Joint Genome Institute 2010 Genomic
Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) project.



Introduction 



Plant-available nitrogen is a precious commodity
in many agricultural soils and the most commonly
limiting nutrient in plant growth. The supply of
plant-available nitrogen to nitrogen (N)-deficient
farming systems is thus vital to productivity [1].
The application of industrially fixed nitrogenous
fertilizer can meet the demand for N. However,
this is a costly option as the price of nitrogenous
fertilizer is connected to the cost of the fossil
fuels required for its production. Furthermore, the
use of nitrogenous fertilizer contributes to
greenhouse gas emissions and pollution of the
environment. A more environmentally sustainable
option is to exploit the process of biological
nitrogen fixation that occurs in the symbiosis
between legumes and rhizobia [2].



In this symbiotic association, rhizobia reduce
atmospheric dinitrogen (N2) into bioavailable N that
can be used by the plant for growth. Pasture
legumes, including the clovers that comprise the
Trifolium genus, are major contributors of
biologically fixed N2 to mixed farming systems throughout
the world [3,4]. In Australia, soils with a history of
growing Trifolium spp. have developed large and
symbiotically diverse populations of Rhizobium
leguminosarum bv. trifolii (R. l. trifolii) that are
able to infect and form nodules on a range of
clover species. The N2-fixation capacity of the
symbioses established by different combinations of
clover hosts (Trifolium spp.) and strains of R. l. trifolii
can vary from 10 to 130% when compared to an
effective host-strain combination [3-9].
R. l. trifolii strain SRDI565 (syn. N8-J [10]) was
isolated from a nodule recovered from the roots of
the annual clover Trifolium subterraneum subsp.
subterraneum that had been inoculated with soil
collected from under a mixed pasture stand from
Tumut, New South Wales, Australia and grown in
N-deficient media for four weeks after inoculation,
in the greenhouse. SRDI565 was first noted for its
sub-optimal N2-fixation capacity on T.
subterraneum cv. Campeda (<60% of that with
strain WSM1325) and formation of white (Fix-)
pseudo-nodules on T. subterraneum cv. Clare
[10,11]. Here we present a preliminary
description of the general features for R. leguminosarum
bv. trifolii strain SRDI565 together with its
genome sequence and annotation.
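
The relative figures quoted above (for example, less than 60% of that with strain WSM1325) express N2 fixation by one host-strain combination as a percentage of an effective reference combination. A minimal sketch of that calculation follows. It assumes shoot dry weight is the proxy for fixed N, which the text does not specify, and the numbers are hypothetical rather than measurements from this study.

```python
def relative_effectiveness(test_value: float, reference_value: float) -> float:
    """Return the test measurement as a percentage of the reference measurement."""
    if reference_value <= 0:
        raise ValueError("reference measurement must be positive")
    return 100.0 * test_value / reference_value


# Hypothetical shoot dry weights (mg per plant); NOT data from this study.
reference_mg = 50.0  # host inoculated with an effective reference strain
test_mg = 27.5       # same host inoculated with the strain under test

pct = relative_effectiveness(test_mg, reference_mg)
print(f"{pct:.1f}% of reference")
```

With these illustrative inputs the test combination falls below a 60% threshold, which is the kind of result the text describes as sub-optimal.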

Classification and general features 

R. l. trifolii strain SRDI565 is a motile,
Gram-negative rod (Figure 1 Left and Center) in the
order Rhizobiales of the class Alphaproteobacteria.
It is fast growing, forming colonies within 3-4 days
when grown on half strength Lupin Agar (½LA)
[12] at 28°C. Colonies on ½LA are white-opaque,
slightly domed and moderately mucoid with
smooth margins (Figure 1 Right).

Symbiotaxonomy 

R. l. trifolii SRDI565 forms nodules on (Nod+), and
fixes N2 (Fix+) with, a range of annual and
perennial clover species of Mediterranean origin (Table
2). SRDI565 forms white, ineffective (Fix-) nodules
with annual clovers T. glanduliferum and T.
subterraneum cv. Clare, and with the perennial
clovers T. pratense and T. polymorphum. SRDI565
does not form nodules on T. vesiculosum.



Genome sequencing and annotation information

Genome project history 

This organism was selected for sequencing on the
basis of its environmental and agricultural
relevance to issues in global carbon cycling,
alternative energy production, and biogeochemical
importance, and is part of the Community
Sequencing Program at the U.S. Department of Energy,
Joint Genome Institute (JGI) for projects of
relevance to agency missions. The genome project is
deposited in the Genomes OnLine Database [30]
and an improved-high-quality-draft genome
sequence in IMG. Sequencing, finishing and
annotation were performed by the JGI. A summary of the
project information is shown in Table 3.

Minimum Information about the Genome
Sequence (MIGS) is provided in Table 1. Figure 2
shows the phylogenetic neighborhood of R. l.
trifolii strain SRDI565 in a 16S rRNA sequence
based tree. This strain clusters closest to R. l.
trifolii T24 and Rhizobium leguminosarum bv.
phaseoli RRE6 with 99.8% and 99.6% sequence
identity, respectively.
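
Percent sequence identity of the kind reported here is the share of aligned positions at which two sequences carry the same base. A minimal sketch follows, assuming the two sequences are already aligned to equal length and contain no gaps (as the Figure 2 legend reports for the 16S rRNA region used). The function name and the toy sequences are illustrative only and are not taken from the strains discussed.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions with identical bases in two pre-aligned, gap-free sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Toy 10-base example differing at the final position only.
toy_a = "ACGTACGTAC"
toy_b = "ACGTACGTAT"
print(percent_identity(toy_a, toy_b))
```

Real comparisons would start from a multiple sequence alignment rather than raw strings; this sketch only shows the final counting step.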

Growth conditions and DNA isolation 

Rhizobium leguminosarum bv. trifolii strain
SRDI565 was cultured to mid logarithmic phase in
60 ml of TY rich media [31] on a gyratory shaker
at 28°C. DNA was isolated from the cells using a
CTAB (Cetyl trimethyl ammonium bromide)
bacterial genomic DNA isolation method [32].








Figure 1. Images of Rhizobium leguminosarum bv. trifolii strain SRDI565 using scanning (Left) and
transmission (Center) electron microscopy as well as light microscopy to show the colony morphology on solid
media (Right).
Table 1. Classification and general features of Rhizobium leguminosarum bv. trifolii SRDI565 according to the
MIGS recommendations [13]

MIGS ID   Property                Term                                            Evidence code
          Current classification  Domain Bacteria                                 TAS [13,14]
                                  Phylum Proteobacteria                           TAS [15]
                                  Class Alphaproteobacteria                       TAS [16]
                                  Order Rhizobiales                               TAS [17,18]
                                  Family Rhizobiaceae                             TAS [19,20]
                                  Genus Rhizobium                                 TAS [19,21-24]
                                  Species Rhizobium leguminosarum bv. trifolii    TAS [19,21,24,25]
          Gram stain              Negative                                        IDA
          Cell shape              Rod                                             IDA
          Motility                Motile                                          IDA
          Sporulation             Non-sporulating                                 NAS
          Temperature range       Mesophile                                       NAS
          Optimum temperature     28°C                                            NAS
          Salinity                Non-halophile                                   NAS
MIGS-22   Oxygen requirement      Aerobic                                         TAS [11]
          Carbon source           Varied                                          NAS
          Energy source           Chemoorganotroph                                NAS
MIGS-6    Habitat                 Soil, root nodule, on host                      TAS [10]
MIGS-15   Biotic relationship     Free living, symbiotic                          TAS [10]
MIGS-14   Pathogenicity           Non-pathogenic                                  NAS
          Biosafety level         1                                               TAS [26]
          Isolation               Root nodule                                     TAS [10]
MIGS-4    Geographic location     NSW, Australia                                  TAS [10]
MIGS-5    Soil collection date    Dec, 1998                                       IDA
MIGS-4.1  Longitude               148.25                                          IDA
MIGS-4.2  Latitude                -35.32                                          IDA
MIGS-4.3  Depth                   0-10 cm
MIGS-4.4  Altitude                Not recorded

Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report
exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated
sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence
codes are from the Gene Ontology project [27].
Figure 2. Phylogenetic tree showing the relationship of Rhizobium leguminosarum bv. trifolii SRDI565
(shown in blue print) with some of the root nodule bacteria in the order Rhizobiales based on aligned
sequences of the 16S rRNA gene (1,307 bp internal region). All sites were informative and there were no
gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5.05 [28]. The tree was built
using the maximum likelihood method with the General Time Reversible model. Bootstrap analysis [29] with
500 replicates was performed to assess the support of the clusters. Type strains are indicated with a
superscript T. Strains with a genome sequencing project registered in GOLD [30] are in bold print and the GOLD
ID is shown after the accession number. Published genomes are indicated with an asterisk.
Table 2. Compatibility of SRDI565 with eleven Trifolium genotypes for nodulation (Nod) and N2-fixation (Fix)

Species name             Cultivar      Common Name   Growth Type  Nod    Fix  Reference
T. glanduliferum Boiss.  Prima         Gland         Annual       +(w)   -
T. michelianum Savi.     Bolta         Balansa       Annual       +      +
T. purpureum Loisel      Paratta       Purple        Annual       +
T. resupinatum L.        Kyambro       Persian       Annual       +      +
T. subterraneum L.       Campeda       Sub. clover   Annual       +      +    [10,11]
T. subterraneum L.       Clare         Sub. clover   Annual       +(w)   -    [10,11]
T. vesiculosum Savi.     Arrotas       Arrowleaf     Annual       -
T. fragiferum L.         Palestine     Strawberry    Perennial    +      +
T. polymorphum Poir      Acc.#087102   Polymorphous  Perennial    +(w)   -    [11]
T. pratense L.                         Red           Perennial    +(w)   -
T. repens L.             Haifa         White         Perennial    +      +

(w) indicates nodules present were white.



Genome sequencing and assembly 

The genome of Rhizobium leguminosarum bv.
trifolii strain SRDI565 was sequenced at the Joint
Genome Institute (JGI) using Illumina technology [33].
An Illumina short-insert paired-end library with
an average insert size of 243 ± 58 bp was used to
generate 18,700,764 reads and an Illumina
long-insert paired-end library with an average insert
size of 8,446 ± 2,550 bp was used to generate
21,538,802 reads, totalling 6,036 Mbp of Illumina
data (unpublished, Feng Chen).
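
As a quick consistency check on the figures above, the total yield divided by the total read count gives the mean read length. The sketch below uses only the numbers reported in this paragraph; the variable names are illustrative.

```python
# Read counts and total yield as reported for the two Illumina libraries.
short_insert_reads = 18_700_764
long_insert_reads = 21_538_802
total_yield_bp = 6_036 * 10**6  # 6,036 Mbp

total_reads = short_insert_reads + long_insert_reads
mean_read_length = total_yield_bp / total_reads

print(total_reads)
print(round(mean_read_length, 1))
```

The result is close to 150 bp per read. The read length itself is not stated in the text, so this is an inference from the reported totals rather than a documented run parameter.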

All general aspects of library construction and
sequencing performed at the JGI can be found at the
JGI user homepage [34]. The initial draft assembly
contained 22 contigs in 16 scaffolds. The initial
draft data was assembled with Allpaths, version
39750, and the consensus was computationally
shredded into 10 Kb overlapping fake reads
(shreds). The Illumina draft data was also
assembled with Velvet, version 1.1.05 [35], and the
consensus sequences were computationally shredded
into 1.5 Kb overlapping fake reads (shreds). The
Illumina draft data was assembled again with
Velvet using the shreds from the first Velvet assembly
to guide the next assembly. The consensus from
the second Velvet assembly was shredded into
1.5 Kb overlapping fake reads. The fake reads
from the Allpaths assembly and both Velvet
assemblies and a subset of the Illumina CLIP
paired-end reads were assembled using parallel phrap,
version 4.24 (High Performance Software, LLC).
Possible mis-assemblies were corrected with
manual editing in Consed [36-38]. Gap closure
was accomplished using repeat resolution
software (Wei Gu, unpublished), and sequencing of
bridging PCR fragments with PacBio
(unpublished, Cliff Han) technology. For the improved
high-quality draft, 4 PCR PacBio consensus
sequences were completed to close gaps and to raise
the quality of the final sequence. The estimated
total size of the genome is 7 Mb and the final
assembly is based on 6,036 Mb of Illumina draft
data, which provides an average 862x coverage of
the genome.
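
Two steps in the paragraph above lend themselves to a short illustration: shredding a consensus into overlapping fake reads, and the coverage arithmetic. The sketch below is not the JGI pipeline. The function names are hypothetical, and the step between shred start positions is an assumption, since the text gives the shred lengths (10 Kb and 1.5 Kb) but not the amount of overlap.

```python
from typing import List


def shred(consensus: str, length: int, step: int) -> List[str]:
    """Cut a consensus into overlapping pieces of `length` bases, one every `step` bases.

    `step` must be smaller than `length` so that neighbouring shreds overlap.
    A final shred is added, if needed, so the end of the consensus is covered.
    """
    if not 0 < step < length:
        raise ValueError("need 0 < step < length for overlapping shreds")
    if len(consensus) <= length:
        return [consensus]
    last_start = len(consensus) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [consensus[s:s + length] for s in starts]


def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average depth of coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Toy consensus; a real run would use e.g. length=10_000 with an assumed step.
pieces = shred("ACGTACGTACGT", length=5, step=3)
print(pieces)

# Figures from the text: 6,036 Mb of Illumina data, estimated 7 Mb genome.
print(round(fold_coverage(6036e6, 7e6)))
```

The coverage figure uses the estimated 7 Mb genome size quoted in the text, which reproduces the reported 862x.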
Genome annotation

Genes were identified using Prodigal [39] as part
of the DOE-JGI annotation pipeline [40], followed
by a round of manual curation using the JGI
GenePRIMP pipeline [41]. The predicted CDSs
were translated and used to search the National
Center for Biotechnology Information (NCBI)
non-redundant database, UniProt, TIGRFam, Pfam,
PRIAM, KEGG, COG, and InterPro databases. These
data sources were combined to assert a product
description for each predicted protein.
Non-coding genes and miscellaneous features were
predicted using tRNAscan-SE [42], RNAmmer [43],
Rfam [44], TMHMM [45], and SignalP [46].
Additional gene prediction analyses and functional
annotation were performed within the Integrated
Microbial Genomes (IMG-ER) platform [47,48].

Genome properties 

The genome is 6,905,599 nucleotides with 60.67%
GC content (Table 4) and comprises 7 scaffolds
(Figures 3, 4, 5, 6, 7, 8 and 9) of 7 contigs. From a
total of 6,836 genes, 6,750 were protein encoding
and 86 were RNA-only encoding genes. The majority of
genes (77.98%) were assigned a putative function
whilst the remaining genes were annotated as
hypothetical. The distribution of genes into COGs
functional categories is presented in Table 5.
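
The gene counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative, and the count of genes with a putative function is back-calculated from the quoted percentage rather than taken as an independent measurement.

```python
total_genes = 6_836
protein_coding = 6_750
rna_only = 86

# The two gene classes should account for every gene.
assert protein_coding + rna_only == total_genes

protein_pct = round(100 * protein_coding / total_genes, 2)
rna_pct = round(100 * rna_only / total_genes, 2)

# 77.98% of genes were assigned a putative function.
with_function = round(0.7798 * total_genes)
hypothetical = total_genes - with_function

print(protein_pct, rna_pct)
print(with_function, hypothetical)
```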



Table 3. Genome sequencing project information for Rhizobium leguminosarum bv. trifolii strain SRDI565.

MIGS ID    Property               Term
MIGS-31    Finishing quality      Improved high-quality draft
MIGS-28    Libraries used         2x Illumina libraries; Std short PE & CLIP long PE
MIGS-29    Sequencing platforms   Illumina HiSeq 2000, PacBio
MIGS-31.2  Sequencing coverage    862x Illumina
MIGS-30    Assemblers             Allpaths version 39750, Velvet 1.015, phrap 4.24
MIGS-32    Gene calling methods   Prodigal 1.4, GenePRIMP
           GOLD ID                Gi08843
           NCBI project ID        81743
           Database: IMG          2517287029
           Project relevance      Symbiotic N2 fixation, agriculture
Table 4. Genome Statistics for Rhizobium leguminosarum bv. trifolii SRDI565

Attribute                          Value       % of Total
Genome size (bp)                   6,905,599   100.00
DNA coding region (bp)             5,960,775   86.32
DNA G+C content (bp)               4,189,855   60.67
Number of scaffolds                7
Number of contigs                  7
Total genes                        6,836       100.00
RNA genes                          86          1.26
rRNA operons*                      3
Protein-coding genes               6,750       98.74
Genes with function prediction     5,331       77.98
Genes assigned to COGs             5,330       77.97
Genes assigned Pfam domains                    80.97
Genes with signal peptides         603         8.82
Genes with transmembrane helices   1,552       22.70
CRISPR repeats                     0


Figure 3. Graphical map of the genome of Rhizobium leguminosarum bv. trifolii strain 
SRDI565 (scaffold 1.1). From bottom to the top of each scaffold: Genes on forward 
strand (color by COG categories as denoted by the IMG platform), Genes on reverse 
strand (color by COG categories), RNA genes (tRNAs green, sRNAs red, other RNAs 
black), GC content, GC skew. 





Figure 4. Graphical map of the genome of Rhizobium leguminosarum bv. trifolii strain 
SRDI565 (scaffold 2.2). From bottom to the top of each scaffold: Genes on forward 
strand (color by COG categories as denoted by the IMG platform), Genes on reverse 
strand (color by COG categories), RNA genes (tRNAs green, sRNAs red, other RNAs 
black), GC content, GC skew. 
Figure 5. Graphical map of the genome of Rhizobium leguminosarum bv. trifolii
strain SRDI565 (scaffold 3.3). From bottom to the top of each scaffold: Genes on 
forward strand (color by COG categories as denoted by the IMG platform), Genes 
on reverse strand (color by COG categories), RNA genes (tRNAs green, sRNAs red, 
other RNAs black), GC content, GC skew. 




Figure 6. Graphical map of the genome of Rhizobium leguminosarum bv. trifolii 
strain SRDI565 (scaffold 4.4). From bottom to the top of each scaffold: Genes on 
forward strand (color by COG categories as denoted by the IMG platform), Genes 
on reverse strand (color by COG categories), RNA genes (tRNAs green, sRNAs red, 
other RNAs black), GC content, GC skew. 




Figure 7. Graphical map of the genome of Rhizobium leguminosarum bv. trifolii 
strain SRDI565 (scaffold 5.5). From bottom to the top of each scaffold: Genes on 
forward strand (color by COG categories as denoted by the IMG platform), Genes 
on reverse strand (color by COG categories), RNA genes (tRNAs green, sRNAs red, 
other RNAs black), GC content, GC skew. 
Table 5. Number of protein coding genes of Rhizobium leguminosarum bv. trifolii SRDI565
associated with the general COG functional categories.

Code  Value  %age   Description
J     191    3.22   Translation, ribosomal structure and biogenesis
A     0      0.00   RNA processing and modification
K     574    9.67   Transcription
L     189    3.19   Replication, recombination and repair
B     3      0.05   Chromatin structure and dynamics
D     41     0.69   Cell cycle control, mitosis and meiosis
Y     0      0.00   Nuclear structure
V     70     1.18   Defense mechanisms
T     320    5.39   Signal transduction mechanisms
M     315    5.31   Cell wall/membrane biogenesis
N     81     1.37   Cell motility
Z     0      0.00   Cytoskeleton
W     0      0.00   Extracellular structures
U     96     1.62   Intracellular trafficking and secretion
O     208    3.51   Posttranslational modification, protein turnover, chaperones
C     326    5.49   Energy production and conversion
G     633    10.67  Carbohydrate transport and metabolism
E     591    9.96   Amino acid transport and metabolism
F     109    1.84   Nucleotide transport and metabolism
H     193    3.25   Coenzyme transport and metabolism
I     216    3.64   Lipid transport and metabolism
P     272    4.58   Inorganic ion transport and metabolism
Q     148    2.49   Secondary metabolite biosynthesis, transport and catabolism
R     758    12.77  General function prediction only
S     600    10.11  Function unknown
-     1,506  22.03  Not in COGs
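
The percentages in Table 5 can be reproduced from the counts. In the sketch below (variable names are illustrative), the counts for categories J through S are summed and each count is divided by that sum, while the "Not in COGs" row is divided by the 6,836 total genes. This reading of the denominators is inferred from the arithmetic and is not stated in the text.

```python
# Category counts from Table 5 (standard one-letter COG codes).
cog_counts = {
    "J": 191, "A": 0, "K": 574, "L": 189, "B": 3, "D": 41, "Y": 0,
    "V": 70, "T": 320, "M": 315, "N": 81, "Z": 0, "W": 0, "U": 96,
    "O": 208, "C": 326, "G": 633, "E": 591, "F": 109, "H": 193,
    "I": 216, "P": 272, "Q": 148, "R": 758, "S": 600,
}
not_in_cogs = 1_506
total_genes = 6_836

# Denominator for the per-category percentages: all category assignments.
total_assignments = sum(cog_counts.values())
cog_pct = {code: round(100 * n / total_assignments, 2)
           for code, n in cog_counts.items()}

# The "Not in COGs" row is instead relative to the total gene count.
not_in_cogs_pct = round(100 * not_in_cogs / total_genes, 2)

print(total_assignments)
print(cog_pct["J"], cog_pct["G"], cog_pct["R"])
print(not_in_cogs_pct)
```

The category counts sum to 5,934, which exceeds the 5,330 genes assigned to COGs in Table 4. That difference would be expected if some genes are assigned to more than one category.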




Figure 8. Graphical map of the genome of Rhizobium leguminosarum bv. trifolii
strain SRDI565 (scaffold 6.6). From bottom to the top of each scaffold: Genes on forward
strand (color by COG categories as denoted by the IMG platform), Genes on
reverse strand (color by COG categories), RNA genes (tRNAs green, sRNAs red,
other RNAs black), GC content, GC skew.
Figure 9. Graphical map of the genome of Rhizobium leguminosarum bv. trifolii strain SRDI565 (scaffold 7.7). From
bottom to the top of each scaffold: Genes on forward strand (color by COG categories as denoted by the
IMG platform), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, sRNAs red,
other RNAs black), GC content, GC skew.